LSTM-TrajGAN: A Deep Learning Approach to Trajectory Privacy Protection
The prevalence of location-based services contributes to the explosive growth of individual-level trajectory data and raises public concerns about privacy issues. In this research, we propose a novel LSTM-TrajGAN approach, which is an end-to-end deep learning model to generate privacy-preserving synthetic trajectory data for data sharing and publication. We design a loss metric function, TrajLoss, to measure the trajectory similarity losses for model training and optimization. The model is evaluated on the trajectory-user-linking task on a real-world semantic trajectory dataset. Compared with other common geomasking methods, our model can better prevent users from being re-identified, and it also preserves essential spatial, temporal, and thematic characteristics of the real trajectory data. The model better balances the effectiveness of trajectory privacy protection and the utility for spatial and temporal analyses, which offers new insights into GeoAI-powered privacy protection.
LOWA: Localize Objects in the Wild with Attributes
We present LOWA, a novel method for localizing objects with attributes
effectively in the wild. It aims to address the insufficiency of current
open-vocabulary object detectors, which are limited by the lack of
instance-level attribute classification and rare class names. To train LOWA, we
propose a hybrid vision-language training strategy to learn object detection
and recognition with class names as well as attribute information. With LOWA,
users can not only detect objects with class names, but also localize
objects by attributes. LOWA is built on top of a two-tower vision-language
architecture and consists of a standard vision transformer as the image encoder
and a similar transformer as the text encoder. To learn the alignment between
visual and text inputs at the instance level, we train LOWA with three training
steps: object-level training, attribute-aware learning, and free-text joint
training of objects and attributes. This hybrid training strategy first ensures
correct object detection, then incorporates instance-level attribute
information, and finally balances the object class and attribute sensitivity.
We evaluate our model's performance on attribute classification and attribute
localization on the Open-Vocabulary Attribute Detection (OVAD) benchmark and
the Visual Attributes in the Wild (VAW) dataset, and experiments indicate
strong zero-shot performance. Ablation studies additionally demonstrate the
effectiveness of each training step of our approach.
Exploring the effectiveness of geomasking techniques for protecting the geoprivacy of Twitter users
With the ubiquitous use of location-based services, large-scale individual-level location data has been widely collected through location-aware devices. Geoprivacy concerns arise around user identity de-anonymization and location exposure. In this work, we investigate the effectiveness of geomasking techniques for protecting the geoprivacy of active Twitter users who frequently share geotagged tweets in their home and work locations. By analyzing over 38,000 geotagged tweets of 93 active Twitter users in three U.S. cities, the two-dimensional Gaussian masking technique with proper standard deviation settings is found to be more effective at protecting users' location privacy, while sacrificing geospatial analytical resolution, than the random perturbation masking method and the aggregation on traffic analysis zones. Furthermore, a three-dimensional theoretical framework considering privacy, analytics, and uncertainty factors simultaneously is proposed to assess geomasking techniques. Our research offers insights into geoprivacy concerns of social media users' georeferenced data sharing for future development of location-based applications and services.
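As a rough illustration of the two point-level masking techniques named in this abstract, the sketch below displaces a coordinate either by Gaussian noise or by a uniformly random offset within a disc. It is not the authors' implementation: the function names, the parameters `sigma_m` and `max_radius_m`, and the flat-Earth conversion from meters to degrees are all assumptions made for the example.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; assumed for a local flat approximation


def _shift(lat, lon, dx_m, dy_m):
    """Shift a point by dx_m meters east and dy_m meters north (small offsets only)."""
    dlat = math.degrees(dy_m / EARTH_RADIUS_M)
    dlon = math.degrees(dx_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon


def gaussian_mask(lat, lon, sigma_m, rng=random):
    """Two-dimensional Gaussian masking: independent N(0, sigma_m^2) offsets east and north."""
    return _shift(lat, lon, rng.gauss(0.0, sigma_m), rng.gauss(0.0, sigma_m))


def random_perturbation_mask(lat, lon, max_radius_m, rng=random):
    """Random perturbation: a uniformly random offset inside a disc of radius max_radius_m."""
    r = max_radius_m * math.sqrt(rng.random())  # sqrt keeps the density uniform over the disc
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return _shift(lat, lon, r * math.cos(theta), r * math.sin(theta))


def displacement_m(lat0, lon0, lat1, lon1):
    """Approximate displacement in meters between an original and a masked point."""
    dy = math.radians(lat1 - lat0) * EARTH_RADIUS_M
    dx = math.radians(lon1 - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    return math.hypot(dx, dy)
```

In this sketch the Gaussian mask has unbounded displacement governed by `sigma_m`, whereas the random perturbation never moves a point farther than `max_radius_m`; choosing either value is the privacy versus analytical-resolution trade-off the abstract describes.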
Evaluation and Mitigation of Agnosia in Multimodal Large Language Models
While Multimodal Large Language Models (MLLMs) are widely used for a variety
of vision-language tasks, one observation is that they sometimes misinterpret
visual inputs or fail to follow textual instructions even in straightforward
cases, leading to irrelevant responses, mistakes, and ungrounded claims. This
observation is analogous to a phenomenon in neuropsychology known as Agnosia,
an inability to correctly process sensory modalities and recognize things
(e.g., objects, colors, relations). In our study, we adapt this concept
to define "agnosia in MLLMs", and our goal is to comprehensively evaluate and
mitigate such agnosia in MLLMs. Inspired by the diagnosis and treatment process
in neuropsychology, we propose a novel framework EMMA (Evaluation and
Mitigation of Multimodal Agnosia). In EMMA, we develop an evaluation module
that automatically creates fine-grained and diverse visual question answering
examples to assess the extent of agnosia in MLLMs comprehensively. We also
develop a mitigation module to reduce agnosia in MLLMs through multimodal
instruction tuning on fine-grained conversations. To verify the effectiveness
of our framework, we evaluate and analyze agnosia in seven state-of-the-art
MLLMs using 9K test samples. The results reveal that most of them exhibit
agnosia across various aspects and degrees. We further develop a fine-grained
instruction set and tune MLLMs to mitigate agnosia, which led to notable
improvement in accuracy.
A Mobile Outdoor Augmented Reality Method Combining Deep Learning Object Detection and Spatial Relationships for Geovisualization
The purpose of this study was to develop a robust, fast and markerless mobile augmented reality method for registration, geovisualization and interaction in uncontrolled outdoor environments. We propose a lightweight deep-learning-based object detection approach for mobile or embedded devices; the vision-based detection results of this approach are combined with spatial relationships by means of the host device’s built-in Global Positioning System receiver, Inertial Measurement Unit and magnetometer. Virtual objects generated based on geospatial information are precisely registered in the real world, and an interaction method based on touch gestures is implemented. The entire method is independent of the network to ensure robustness to poor signal conditions. A prototype system was developed and tested on the Wuhan University campus to evaluate the method and validate its results. The findings demonstrate that our method achieves a high detection accuracy, stable geovisualization results and interaction.